Simple and interpretable discrimination
Abstract
Summary: Several authors have proposed approaches for constructing easily interpretable components that still preserve the nice features of principal components. This work employs one such approach to produce interpretable canonical variates and explores their discrimination and classification behavior. Discriminant analysis (DA) is a descriptive multivariate technique for analyzing grouped data, i.e. data in which the rows of the data matrix are divided into a number of groups that usually represent samples from different populations (Fisher, 1936). Recently DA has also been viewed as a promising dimensionality reduction technique. Indeed, the presence of group structure in the data additionally facilitates dimensionality reduction. The best-known variety of DA is linear discriminant analysis (LDA), whose central goal is to describe the differences between the groups in terms of discriminant functions defined as linear combinations of the original variables (Fisher, 1936). The interpretation of the discriminant functions is based on the coefficients of the original variables in the linear combinations. The problem is similar to the interpretation of principal components (Trendafilov and Jolliffe, 2006): the interpretation is clear when only a few coefficients are large and the rest are close to or exactly zero. Unfortunately, in many applications this is not the case. There are several approaches to the interpretation of the discriminant functions, each of which has disadvantages (Rencher, 1992; Trendafilov and Jolliffe, 2006a). A modification of LDA aiming at better discrimination, and possibly better interpretation, is considered by Krzanowski (1995): the canonical variates are constrained to be orthogonal in the original data space.
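As a rough illustration of the classical LDA described above, the following sketch computes discriminant functions as linear combinations of the original variables by solving the generalized eigenproblem for the between-groups and within-groups scatter matrices. The function name `canonical_variates` and the symbols `W`, `B` and `A` are assumptions made for this example and do not come from the paper; the sketch assumes `W` is positive definite.

```python
import numpy as np


def canonical_variates(X, y):
    """Classical (non-sparse) LDA: return coefficient vectors and eigenvalues.

    X is an (n, p) data matrix, y holds the group label of each row.
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    groups = np.unique(y)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)

    W = np.zeros((p, p))  # within-groups scatter
    B = np.zeros((p, p))  # between-groups scatter
    for g in groups:
        Xg = X[y == g]
        mean_g = Xg.mean(axis=0)
        centered = Xg - mean_g
        W += centered.T @ centered
        d = (mean_g - grand_mean)[:, None]
        B += Xg.shape[0] * (d @ d.T)

    # Solve B a = lambda W a through the Cholesky factor W = L L^T,
    # which turns it into an ordinary symmetric eigenproblem.
    L = np.linalg.cholesky(W)
    L_inv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(L_inv @ B @ L_inv.T)

    order = np.argsort(evals)[::-1]        # largest eigenvalue first
    r = min(p, len(groups) - 1)            # at most (groups - 1) variates
    A = L_inv.T @ evecs[:, order[:r]]      # columns satisfy a^T W a = 1
    return A, evals[order[:r]]
```

Each column of `A` holds the raw coefficients of one discriminant function. In general these coefficients are all nonzero, which is the interpretation difficulty the abstract refers to.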
Recently, there has been increasing interest in approximate but computationally simple and easy-to-interpret versions of principal component analysis (PCA). Roughly speaking, the aim of all these works is to find "components" with a considerable number of exactly zero coefficients. The reason is twofold: one may need such a simplified PCA either for interpretation or for coping with very large data. In this work we extend this strategy to LDA, where the interpretation problem seems even more serious than in PCA: indeed, the interpretation relies on three types of coefficients (raw, standardized and structure), which is quite often controversial (Rencher, 1992; Trendafilov and Jolliffe, 2006a). One such approach to the interpretation of the discriminant functions has already been proposed by Trendafilov and Jolliffe (2006a). The classical LDA problem is subject to additional LASSO constraints requiring that the sum of the absolute values of the coefficients be less than some pre-specified threshold t (Tibshirani, …
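The LASSO-constrained problem mentioned above can be written schematically as follows. This is only a sketch under assumptions: the symbols $a$ (coefficient vector), $B$ (between-groups scatter), $W$ (within-groups scatter) and $p$ (number of variables), as well as the normalization $a^\top W a = 1$, are the usual Fisher-criterion conventions and are not stated in the text; only the threshold $t$ and the bound on the sum of absolute values come from the abstract.

```latex
\max_{a \in \mathbb{R}^{p}} \; a^{\top} B a
\quad \text{subject to} \quad
a^{\top} W a = 1,
\qquad
\sum_{j=1}^{p} \lvert a_{j} \rvert \le t .
```

A smaller $t$ tends to force more coefficients $a_j$ to be exactly zero, which is what makes the resulting discriminant function easier to interpret.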
Similar resources
Feature Selection with Single-Layer Perceptrons for a Multicentre 1H-MRS Brain Tumour Database
A Feature Selection process with Single-Layer Perceptrons is shown to provide optimum discrimination of an international, multi-centre 1H-MRS database of brain tumors at reasonable computational cost. Results are both intuitively interpretable and very accurate. The method remains simple enough to allow its easy integration into existing medical decision support systems.
A NOTE TO INTERPRETABLE FUZZY MODELS AND THEIR LEARNING
In this paper we turn our attention to a well-developed theory of fuzzy/linguistic models that are interpretable and, moreover, can be learned from data. We present four different situations demonstrating both the interpretability and the learning abilities of these models.
On the use of spatial relations between objects for image classification
Image classification is addressed in this paper by utilizing spatial relations of detected objects in a rule-based fashion. Instances of particular object classes are detected by combining bottom-up (learnable models based on simple features) and top-down information (object models consisting of primitive geometric shapes such as lines). The rule-based system acts as a model for the spatial configura...
Detection and Discrimination of Theileria annulata and Theileria lestoquardi by using a Single PCR
The aim of this study was to detect and differentiate Theileria annulata and T. lestoquardi (hirci) by PCR. Members of the genus Theileria are tick-borne hemoprotozoan parasites that cause fatal and enervating diseases of cattle and sheep in Iran. In order to develop a specific method for detecting and identifying Theileria species, specific primers from the surface protein (SP) seque...
Interpretable Policies for Dynamic Product Recommendations
In many applications, it may be better to compute a good interpretable policy instead of a complex optimal one. For example, a recommendation engine might perform better when accounting for user profiles, but in the absence of such loyalty data, assumptions would have to be made that increase the complexity of the recommendation policy. A simple greedy recommendation could be implemented based ...
Journal:
- Computational Statistics & Data Analysis
Volume: 53
Publication date: 2009